Invariant content-based image retrieval using the Fourier-Mellin transform

Abstract

Recent advances in storage, computing and communication technology have created a need for efficient, user-friendly methods of accessing multimedia archives. In this paper we address the problem of automatically extracting visual descriptions suitable for indexing images and videos in a database. A new method is proposed, and its applicability is shown using a collection of still images extracted from a video archive. In contrast to classical approaches, which describe different image aspects (e.g. color, shape, texture) separately, we take a holistic approach based on integral transforms. In this way, a single multidimensional descriptor represents all image aspects, and the user is not required to combine multiple independent rankings. Compared with other holistic approaches, such as those based on the wavelet transform, we seek greater robustness to image transformations such as translation, rotation, and scaling. [insert abstract2] Invariance to rotation, translation and scaling has been verified for the ideal case of rigid 2D image transformations, as well as on images that have been transformed through camera motion (pan/tilt/rotation) and zooming effects. An experimental database has been created using various TV news clips. Shots presenting considerable camera motion and zooming, as well as unrestricted subject motion, have been detected, and a number of still images have been extracted from each of them, for a total of 2,082 images. This shot-based clustering naturally provides a ground truth for the desired similarity rankings. Experimental results yield on average 67% recall for the 12 top-ranked hits, and 54% precision at 100% recall. This shows that, although the signature is designed only to be invariant to rigid 2D Euclidean transformations, it is highly resilient to much more complex transformations (projection, arbitrary subject motion, subject appearance/disappearance), and seemingly captures perceptually relevant image features.
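
The two figures quoted above (recall for the 12 top-ranked hits, and precision at 100% recall) can be illustrated with a minimal sketch. It assumes the standard retrieval definitions and a shot-based ground truth in which an image is relevant when it comes from the same shot as the query; the abstract does not state exactly how the scores were computed or averaged, so the function names, the toy ranking, and the exclusion of the query image itself are all hypothetical.

```python
def recall_at_k(ranked_shots, query_shot, k):
    """Fraction of the query's shot-mates found among the first k results."""
    relevant_total = sum(1 for s in ranked_shots if s == query_shot)
    if relevant_total == 0:
        return 0.0
    hits = sum(1 for s in ranked_shots[:k] if s == query_shot)
    return hits / relevant_total


def precision_at_full_recall(ranked_shots, query_shot):
    """Precision measured at the rank where the last relevant image appears."""
    relevant_ranks = [i + 1 for i, s in enumerate(ranked_shots) if s == query_shot]
    if not relevant_ranks:
        return 0.0
    # All relevant images are retrieved once we reach the last relevant rank.
    return len(relevant_ranks) / relevant_ranks[-1]


# Hypothetical ranking for one query taken from shot "A": each entry is the
# shot label of a retrieved image, best match first (query image excluded).
ranking = ["A", "A", "B", "A", "C", "B", "A", "C"]

print(recall_at_k(ranking, "A", 4))            # 3 of the 4 relevant images
print(precision_at_full_recall(ranking, "A"))  # 4 relevant within 7 results
```

Averaging these per-query values over every image in the collection would give collection-level scores of the kind reported; whether the authors averaged in this way is an assumption.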